Sorting improves word-aligned bitmap indexes
نویسندگان
چکیده
Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid (WAH) compression. These techniques are sensitive to the order of the rows: a simple lexicographical sort can divide the index size by 9 and make indexes several times faster. We investigate row-reordering heuristics. Simply permuting the columns of the table can increase the sorting efficiency by 40%. Secondary contributions include efficient algorithms to construct and aggregate bitmaps. The effect of word length is also reviewed by constructing 16-bit, 32-bit and 64-bit indexes. Using 64-bit CPUs, we find that 64-bit indexes are slightly faster than 32-bit indexes despite being nearly twice as large.
منابع مشابه
SBH: Super byte-aligned hybrid bitmap compression
Bitmap indexes are commonly used in data warehousing applications such as on-line analytic processing (OLAP). Storing the bitmaps in compressed form has been shown to be effective not only for low cardinality attributes, as conventional wisdom would suggest, but also for high cardinality attributes. Compressed bitmap indexes, such as Byte-aligned been shown to be efficient in terms of both time...
متن کاملReordering Columns for Smaller Indexes
Column-oriented indexes—such as projection or bitmap indexes—are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right order before sorting can reduce the number of runs by a factor of two or more. For many cases, we prove that the number of runs in table columns is minimized if...
متن کاملCompressing Bitmap Indexes for Faster Search Operations
In this paper, we study the effects of compression on bitmap indexes. Themain operations on the bitmaps during query processing are bitwise logical operations such as AND,OR,NOT, etc.Using the general purpose compression schemes, such as gzip, the logical operations on the compressed bitmaps are much slower than on the uncompressed bitmaps. Specialized compression schemes, like the byte-aligned...
متن کاملBetter bitmap performance with Roaring bitmaps
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...
متن کاملBreaking the Curse of Cardinality on Bitmap Indexes
Bitmap indexes are known to be efficient for ad-hoc range queries that are common in data warehousing and scientific applications. However, they suffer from the curse of cardinality, that is, their efficiency deteriorates as attribute cardinalities increase. A number of strategies have been proposed, but none of them addresses the problem adequately. In this paper, we propose a novel binned bit...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Data Knowl. Eng.
دوره 69 شماره
صفحات -
تاریخ انتشار 2010